 survival probability



Enabling Delayed-Full Charging Through Transformer-Based Real-Time-to-Departure Modeling for EV Battery Longevity

Lee, Yonggeon, Hwang, Jibin, Kondoro, Alfred Malengo, Song, Juhyun, Noh, Youngtae

arXiv.org Artificial Intelligence

Electric vehicles (EVs) are key to sustainable mobility, yet their lithium-ion batteries (LIBs) degrade more rapidly under prolonged high states of charge (SOC). This can be mitigated by delaying full charging until just before departure, which requires accurate prediction of user departure times. In this work, we propose a Transformer-based real-time-to-event (TTE) model for accurate EV departure prediction. Our approach represents each day as a TTE sequence by discretizing time into grid-based tokens. Unlike previous methods, which depend primarily on temporal dependencies in historical patterns, our method leverages streaming contextual information to predict departures. Evaluation on a real-world study involving 93 users and passive smartphone data demonstrates that our method effectively captures irregular departure patterns within individual routines, outperforming baseline models. These results highlight the potential for practical deployment of the delayed-full charging algorithm and its contribution to sustainable transportation systems.



Evaluating and Learning Optimal Dynamic Treatment Regimes under Truncation by Death

Park, Sihyung, Lu, Wenbin, Yang, Shu

arXiv.org Machine Learning

We introduce a principal stratification-based method, focusing on the always-survivor value function. We derive a semiparametrically efficient, multiply robust estimator for multi-stage DTRs, demonstrating its robustness and efficiency. Empirical validation and an application to electronic health records showcase its utility for personalized treatment optimization.


Emergent Risk Awareness in Rational Agents under Resource Constraints

Ornia, Daniel Jarne, Bishop, Nicholas, Dyer, Joel, Lee, Wei-Chen, Calinescu, Ani, Farmer, Doyne, Wooldridge, Michael

arXiv.org Artificial Intelligence

Advanced reasoning models with agentic capabilities (AI agents) are deployed to interact with humans and to solve sequential decision-making problems under (approximate) utility functions and internal models. When such problems have resource or failure constraints where action sequences may be forcibly terminated once resources are exhausted, agents face implicit trade-offs that reshape their utility-driven (rational) behaviour. Additionally, since these agents are typically commissioned by a human principal to act on their behalf, asymmetries in constraint exposure can give rise to previously unanticipated misalignment between human objectives and agent incentives. We formalise this setting through a survival bandit framework, provide theoretical and empirical results that quantify the impact of survival-driven preference shifts, identify conditions under which misalignment emerges and propose mechanisms to mitigate the emergence of risk-seeking or risk-averse behaviours. As a result, this work aims to increase understanding and interpretability of emergent behaviours of AI agents operating under such survival pressure, and offer guidelines for safely deploying such AI systems in critical resource-limited environments.


KM-GPT: An Automated Pipeline for Reconstructing Individual Patient Data from Kaplan-Meier Plots

Zhao, Yao, Sun, Haoyue, Ding, Yantian, Xu, Yanxun

arXiv.org Machine Learning

Reconstructing individual patient data (IPD) from Kaplan-Meier (KM) plots provides valuable insights for evidence synthesis in clinical research. However, existing approaches often rely on manual digitization, which is error-prone and lacks scalability. To address these limitations, we develop KM-GPT, the first fully automated, AI-powered pipeline for reconstructing IPD directly from KM plots with high accuracy, robustness, and reproducibility. KM-GPT integrates advanced image preprocessing, multi-modal reasoning powered by GPT-5, and iterative reconstruction algorithms to generate high-quality IPD without manual input or intervention. Its hybrid reasoning architecture automates the conversion of unstructured information into structured data flows and validates data extraction from complex KM plots. To improve accessibility, KM-GPT is equipped with a user-friendly web interface and an integrated AI assistant, enabling researchers to reconstruct IPD without requiring programming expertise. KM-GPT was rigorously evaluated on synthetic and real-world datasets, consistently demonstrating superior accuracy. To illustrate its utility, we applied KM-GPT to a meta-analysis of gastric cancer immunotherapy trials, reconstructing IPD to facilitate evidence synthesis and biomarker-based subgroup analyses. By automating traditionally manual processes and providing a scalable, web-based solution, KM-GPT transforms clinical research by leveraging reconstructed IPD to enable more informed downstream analyses, supporting evidence-based decision-making.


Statistical Model Checking of NetLogo Models

Pangallo, Marco, Giachini, Daniele, Vandin, Andrea

arXiv.org Artificial Intelligence

Agent-based models (ABMs) are gaining increasing traction in several domains, due to their ability to represent complex systems that are not easily expressible with classical mathematical models. This expressivity and richness come at a cost: ABMs can typically be analyzed only through simulation, making their analysis challenging. Specifically, when studying the output of ABMs, the analyst is often confronted with practical questions such as: (i) how many independent replications should be run? (ii) how many initial time steps should be discarded as a warm-up? (iii) after the warm-up, how long should the model run? (iv) what are the right parameter values? Analysts usually resort to rules of thumb and experimentation, which lack statistical rigor. This is mainly because addressing these points takes time, and analysts prefer to spend their limited time improving the model. In this paper, we propose a methodology, drawing on the field of Statistical Model Checking, to automate the process and provide guarantees of statistical rigor for ABMs written in NetLogo, one of the most popular ABM platforms. We discuss MultiVeStA, a tool that dramatically reduces the time and human intervention needed to run statistically rigorous checks on ABM outputs, and introduce its integration with NetLogo. Using two ABMs from the NetLogo library, we showcase MultiVeStA's analysis capabilities for NetLogo ABMs, as well as a novel application to statistically rigorous calibration. Our tool-chain makes it straightforward to perform statistical checks with NetLogo models, promoting more rigorous and reliable analyses of ABM outputs.
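
Question (i) in the abstract, how many independent replications to run, can be illustrated with a generic sequential stopping rule: keep adding replications until the confidence interval around the estimated mean is narrow enough. The sketch below is not MultiVeStA's algorithm or API; the function name `estimate_mean`, the normal-approximation interval, and the batch size are assumptions chosen for illustration.

```python
import random
import statistics


def estimate_mean(simulate, delta, z=1.96, batch=30, max_runs=100_000):
    """Run `simulate()` in batches until the CI half-width is at most `delta`.

    simulate: hypothetical zero-argument callable returning one replication's output.
    z: normal quantile (1.96 gives roughly a 95% interval; an assumption here).
    Returns (mean, half_width, number_of_replications).
    """
    samples = []
    half_width = float("inf")
    while len(samples) < max_runs:
        # Add one more batch of independent replications.
        samples.extend(simulate() for _ in range(batch))
        # Normal-approximation half-width of the confidence interval.
        half_width = z * statistics.stdev(samples) / len(samples) ** 0.5
        if half_width <= delta:
            break
    return statistics.fmean(samples), half_width, len(samples)


# Toy stand-in for a model run: one uniform random draw per replication.
rng = random.Random(0)
mean, half_width, n_runs = estimate_mean(rng.random, delta=0.02)
```

The number of replications is then an output of the procedure rather than a rule-of-thumb input: a noisier model simply triggers more batches before the stopping condition holds.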


The C-index Multiverse

Sierra, Begoña B., McLean, Colin, Hall, Peter S., Vallejos, Catalina A.

arXiv.org Machine Learning

Quantifying out-of-sample discrimination performance for time-to-event outcomes is a fundamental step for model evaluation and selection in the context of predictive modelling. The concordance index, or C-index, is a widely used metric for this purpose, particularly with the growing development of machine learning methods. Beyond differences between proposed C-index estimators (e.g. Harrell's, Uno's and Antolini's), we demonstrate the existence of a C-index multiverse among available R and Python software, where seemingly equal implementations can yield different results. This can undermine reproducibility and complicate fair comparisons across models and studies. Key variation sources include tie handling and adjustment to censoring. Additionally, the absence of a standardised approach to summarise risk from survival distributions results in another source of variation dependent on input types. We demonstrate the consequences of the C-index multiverse when quantifying predictive performance for several survival models (from Cox proportional hazards to recent deep learning approaches) on publicly available breast cancer data, and semi-synthetic examples. Our work emphasises the need for better reporting to improve transparency and reproducibility. This article aims to be a useful guideline, helping analysts when navigating the multiverse, providing unified documentation and highlighting potential pitfalls of existing software. All code is publicly available at: www.github.com/BBolosSierra/CindexMultiverse.
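
To make the tie-handling point concrete, here is a minimal sketch of a Harrell-style C-index in plain Python. It is not taken from the paper's repository or from any of the packages it compares; the function name `harrell_c`, the `tie_weight` parameter, and the choice to skip pairs with tied observed times are assumptions made for illustration.

```python
from itertools import combinations


def harrell_c(times, events, risks, tie_weight=0.5):
    """Harrell-style concordance for right-censored data.

    times: observed times; events: 1 if the event was observed, 0 if censored;
    risks: higher score means higher predicted risk (shorter survival).
    tie_weight: credit given to comparable pairs whose risk scores are tied.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Pairs with tied times are skipped in this sketch (an assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += tie_weight
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.5, 0.5, 0.1]

c_half = harrell_c(times, events, risks, tie_weight=0.5)  # ties count as half
c_zero = harrell_c(times, events, risks, tie_weight=0.0)  # ties count as discordant
```

On this toy data there are five comparable pairs, four strictly concordant and one with tied risk scores, so the two conventions give 0.9 and 0.8 for the same predictions. A single tie is enough to move the reported value.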


Reduction Techniques for Survival Analysis

Piller, Johannes, Orsini, Léa, Wiegrebe, Simon, Zobolas, John, Burk, Lukas, Langbein, Sophie Hanna, Studener, Philip, Goeswein, Markus, Bender, Andreas

arXiv.org Machine Learning

In this work, we discuss what we refer to as reduction techniques for survival analysis, that is, techniques that "reduce" a survival task to a more common regression or classification task, without ignoring the specifics of survival data. Such techniques particularly facilitate machine learning-based survival analysis, as they allow for applying standard tools from machine and deep learning to many survival tasks without requiring custom learners. We provide an overview of different reduction techniques and discuss their respective strengths and weaknesses. We also provide a principled implementation of some of these reductions, such that they are directly available within standard machine learning workflows. We illustrate each reduction using dedicated examples and perform a benchmark analysis that compares their predictive performance to established machine learning methods for survival analysis.
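
One commonly described reduction of this kind is the discrete-time approach, in which each subject is expanded into one row per interval at risk and a binary classifier is then fit to the rows. The sketch below illustrates only that general idea; it is not the authors' implementation, and the function names, the interval convention, and the decision to keep a censored subject's final partial interval with label 0 are assumptions.

```python
def to_person_period(times, events, cut_points):
    """Expand (time, event) pairs into one row per subject and interval at risk.

    cut_points: increasing right edges of the intervals (0, c1], (c1, c2], ...
    Returns (subject_id, interval_index, label) tuples; label is 1 only in
    the interval in which an event is observed.
    """
    rows = []
    for sid, (t, e) in enumerate(zip(times, events)):
        for k, right in enumerate(cut_points):
            ended_here = t <= right
            rows.append((sid, k, int(ended_here and bool(e))))
            if ended_here:
                break  # subject leaves the risk set after this interval
    return rows


def empirical_survival(rows, n_intervals):
    """Stand-in for a classifier: per-interval event fraction as the hazard."""
    surv, curve = 1.0, []
    for k in range(n_intervals):
        labels = [label for _, interval, label in rows if interval == k]
        hazard = sum(labels) / len(labels) if labels else 0.0
        surv *= 1.0 - hazard  # S(k) = prod_{j <= k} (1 - h_j)
        curve.append(surv)
    return curve


rows = to_person_period([1.5, 3.0, 2.5], [1, 0, 1], cut_points=[1, 2, 3])
curve = empirical_survival(rows, n_intervals=3)
```

In practice `empirical_survival` would be replaced by any probabilistic binary classifier trained on the expanded rows with covariates attached, which is what makes standard learners usable without survival-specific modifications.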


Predicting the Lifespan of Industrial Printheads with Survival Analysis

Parii, Dan, Janssen, Evelyne, Tang, Guangzhi, Kouzinopoulos, Charalampos, Pietrasik, Marcin

arXiv.org Artificial Intelligence

This paper has been published in the 8th IEEE Conference on Industrial Cyber-Physical Systems (ICPS) in Emden, Germany, May 12-15, 2025. Accurately predicting the lifespan of critical device components is essential for maintenance planning and production optimization, making it a topic of significant interest in both academia and industry. In this work, we investigate the use of survival analysis for predicting the lifespan of production printheads developed by Canon Production Printing. Specifically, we focus on the application of five techniques to estimate survival probabilities and failure rates: the Kaplan-Meier estimator, Cox proportional hazard model, Weibull accelerated failure time model, random survival forest, and gradient boosting. The resulting estimates are further refined using isotonic regression and subsequently aggregated to determine the expected number of failures. The predictions are then validated against real-world ground truth data across multiple time windows to assess model reliability. Our quantitative evaluation using three performance metrics demonstrates that survival analysis outperforms industry-standard baseline methods for printhead lifespan prediction.
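
As a reference point for the simplest of the five techniques, the following is a minimal sketch of the standard Kaplan-Meier product-limit estimator in plain Python. It is not the paper's code, and the toy lifetimes are made up purely for illustration; the function name `kaplan_meier` is an assumption.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed failure.

    times: observed lifetimes; events: 1 if the failure was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all units that share the same observed time.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            removed += 1
            i += 1
        if failures:
            # Product-limit update: S(t) = S(t-) * (1 - d_t / n_t).
            surv *= 1.0 - failures / n_at_risk
            curve.append((t, surv))
        # Failed and censored units both leave the risk set.
        n_at_risk -= removed
    return curve


# Illustrative (made-up) lifetimes: five units, two of them censored.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
```

For this toy input the estimated survival probability steps down to about 0.8, 0.6, and 0.3 at times 1, 2, and 3; the censored units reduce the risk set without producing a step.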